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Simulating a binary black hole (BBH) coalescence by solving Einstein’s equations is computation¬ 
ally expensive, requiring days to months of supercomputing time. Using reduced order modeling 
techniques, we construct an accurate surrogate model, which is evaluated in a millisecond to a sec¬ 
ond, for numerical relativity (NR) waveforms from non-spinning BBH coalescences with mass ratios 
in [1,10] and durations corresponding to about 15 orbits before merger. We assess the model’s un¬ 
certainty and show that our modeling strategy predicts NR waveforms not used for the surrogate’s 
training with errors nearly as small as the numerical error of the NR code. Our model includes all 
spherical-harmonic -iYi m waveform modes resolved by the NR code up to l = 8. We compare our 
surrogate model to Effective One Body waveforms from 50-300Mq for advanced LIGO detectors and 
find that the surrogate is always more faithful (by at least an order of magnitude in most cases). 


Since the breakthroughs of 2005 [1-3], tremendous 
progress in numerical relativity (NR) has led to hun¬ 
dreds of simulations of binary black hole (BBH) coa¬ 
lescences [ 10]. This progress has been driven partly 

by data analysis needs of advanced ground-based grav¬ 
itational wave detectors like LIGO [L ] and Virgo [12]. 
Recent upgrades to these detectors are expected to yield 
the first direct detections of gravitational waves (GWs) 
from compact binary coalescences [13]. 

Despite the remarkable progress of the NR commu¬ 
nity, a single high-quality simulation typically requires 
days to months of supercomputing time. This high com¬ 
putational cost makes it difficult to directly use NR 
waveforms for data analysis, except for injection stud¬ 
ies [4, 9], since detecting GWs and inferring their source 
parameters may require thousands to millions of accu¬ 
rate gravitational waveforms. Nevertheless, a first tem¬ 
plate bank for nonspinning binaries in Advanced LIGO 
has been recently constructed from NR waveforms [14]. 
Furthermore, NR waveforms have been used successfully 
in calibrating inspiral-merger-ringdown (IMR) effective- 
one-body (EOB) [15-21] and phenomenological [22-25] 
models. These models have free parameters that can be 
set by matching to NR waveforms and are suitable for 
certain GW data analysis studies [26]. However, these 
models can have systematic errors since they assume a 
priori physical waveform structure and are calibrated and 
tested against a small set of NR simulations. 

In this Letter, we present an ab initio methodology 
based on surrogate [27, 28] and reduced order modeling 
techniques [29-33] that is capable of accurately predict¬ 
ing the gravitational waveform outputs from NR with¬ 
out any phenomenological assumptions or approxima¬ 
tions to general relativity. From a small set of specially 
selected non-spinning BBH simulations performed with 
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FIG. 1. Top: The + polarization (2,2) mode prediction 
for q = 2, the surrogate model’s worst prediction over q from 
a “leave-one-out” surrogate that was not trained with this 
waveform (see below). Our full surrogate, trained on the en¬ 
tire data set, is more accurate. Bottom: Phase Sip 2 ’ 2 and 
waveform differences between the surrogate and highest res¬ 
olution (Lev4) SpEC waveforms. Also shown is the SpEC 
numerical truncation error found by comparing the two high¬ 
est resolution (Lev4 and Lev3) waveforms. 


the Spectral Einstein Code (SpEC) [34-36], we build a 
surrogate model that can be used in place of performing 
SpEC simulations. The techniques are general, however, 
and directly apply to other NR codes or even analyti¬ 
cal waveform models. The surrogate model constructed 
here generates non-spinning BBH waveforms with mass 
ratios q £ [1,10], contains 25-31 gravitational wave cy¬ 
cles before peak amplitude, and includes many spherical- 
harmonic modes (see Table II and its caption). These 
choices are made based on available NR waveforms and 
are not limitations of the method. Our surrogate model 
has errors close to the estimated numerical error of the 
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input waveforms. An example comparing the surrogate 
output to an NR waveform can be seen in Fig. 1. This 
simulation took 9.3 days using 48 cores but only ~ 0.01 
sec for the surrogate evaluation of the (2,2) mode. 

Previous work [27, 37] built surrogates for EOB wave¬ 
forms; building and assessing surrogate models of NR 
waveforms have unique challenges associated with input 
waveforms that are expensive to compute. We summa¬ 
rize next the construction of our model, focusing on steps 
not addressed in [27] but are required for NR surrogates. 

Parametric sampling- Typically, a surrogate model is 
trained on a dense set of waveforms known as the train¬ 
ing set. In the case of NR, we cannot afford to gen¬ 
erate a large number of waveforms. Instead, we gener¬ 
ate a dense set of non-spinning waveforms using an EOB 
model [18], as implemented in [38], which contains the 
(£,m) = {(2, 2), (2,1), (3, 3), (4,4), (5, 5)} spin-weight -2 
spherical-harmonic modes and captures robust features 
of NR waveforms. The EOB training set waveforms are 
computed for times in [—2750,100]M (M is the total 
mass), which is the interval over which we build our sur¬ 
rogates. 

Next, on this training set we apply a greedy algorithm 
to expose the most relevant mass ratio values [39, 40]. 
The algorithm proceeds from a linear basis constructed 
from i waveforms already chosen. The L 2 norms of the 
differences between the training set waveforms and their 
projection onto this basis are computed. The waveform 
with the largest such error is added to the basis as its 
i + 1 element. SpEC simulations of non-spinning BBH 
mergers are then performed for these mass ratios. The 
resulting NR waveforms are used to build our surrogates 
without any further input from the EOB model. 

We seeded the greedy algorithm with 5 publicly avail¬ 
able SpEC simulations of non-spinning BBH mergers 
[10, 19] (see Table I), and the next 17 (ordered) mass ra¬ 
tio values are the algorithm’s output based on the EOB 
model. The final ~10 mass ratios are included to improve 
the surrogate if necessary, since we can assess the surro¬ 
gate model’s accuracy only after it is built. Our method 
for building surrogates is hierarchical [27, 40]; additional 
NR waveforms can be included to improve the model’s 
accuracy. 

Generating the NR waveforms- Table I summarizes 
the 22 SpEC simulations used in this paper. See, e.g., 
Ref. [35] for the numerical techniques used in SpEC. The 
numerical resolution is denoted by “Levi”, where i is 
an integer that controls the local truncation error in the 
metric and its derivatives allowed by adaptive mesh re¬ 
finement (AMR) in SpEC; larger numbers correspond to 
smaller errors (the error threshold scales like e~ l ) and 
more computationally-expensive simulations. The scal¬ 
ing of global quantities (e.g. waveform errors) with i is 
difficult to estimate a priori. Two to five levels of res¬ 
olution are simulated for each mass ratio. To achieve 
quasi-circular orbits, initial data are subject to an itera- 
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1 

180 

1.00 

5.1 

9867 

28.2 

12 

191 

2.51 

65 

6645 

22.5 

2 

181 

6.00 

5.8 

7056 

26.5 

13 

192 

6.58 

4.0 

5149 

21.1 

3 

182 

4.00 

12 

3840 

15.6 

14 

193 

3.50 

3.0 

5242 

19.6 

4 

183 

3.00 

4.8 

4008 

15.6 

15 

194 

1.52 

74 

5774 

19.6 

5 

184 

2.00 

15 

4201 

15.6 

16 

195 

7.76 

22 

5226 

21.9 

6 

185 

9.99 

31 

5817 

24.9 

17 

196 

9.66 

23 

5330 

23.1 

7 

186 

8.27 

16 

5687 

23.7 

18 

197 

5.52 

25 

5061 

20.3 

8 

187 

5.04 

3.0 

4807 

19.2 

19 

198 

1.20 

17 

6315 

20.7 

9 

188 

7.19 

15 

5439 

22.3 

20 

199 

8.73 

8.5 

5302 

22.6 

10 

189 

9.17 

13 

6019 

25.2 

21 

200 

3.27 

36 

5507 

20.2 

11 

190 

4.50 

2.5 

5199 

20.1 

22 

201 

2.32 

15 

5719 

20.0 


TABLE I. Properties of the highest resolution SpEC simula¬ 
tions used for building BBH waveform surrogates. The quan¬ 
tity e _5 is the orbital eccentricity divided by 10 5 [43]. The 
duration T/M and number of orbits (Orbs) are also given. 
The SpEC simulations are available in the public waveform 
catalog [10] under the name “SXS:BBH:/D.” 
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FIG. 2. The relative error, /if 2 — of successive 

resolutions SpEC Levi for the (2,2) mode of simulation 19 in 
Table I. Top: Waveform output as directly given by SpEC 
(“Unaligned”). Bottom: “Aligned,” which involves a multi- 
mode peak alignment scheme described by Eq. (2) followed 
by a rotation of the binary around the 2 -axis to align the 
waveform phases at t; = —2750 M. Our surrogate is built from 
NR waveform data after alignment, and so this measurement 
of truncation error is the most relevant for surrogate model 
building. 



tive eccentricity reduction procedure resulting in eccen¬ 
tricities <7x 10 -4 [ 11-43]. 

SpEC numerically solves an initial boundary value 
problem defined on a finite computational domain. To 
obtain waveforms at future null infinity , we use 
the Cauchy characteristic extraction (CCE) method [44- 
48]. Using the PittNull code [44-46], we compute the 
Newman-Penrose scalar T 4 at and finally obtain the 
gravitational wave strain h through two temporal inte¬ 
grations. We minimize the low-frequency, noise-induced 
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“drifts” [47] by using frequency cut-offs . 1 

Figure 2 shows the convergence typically observed in 
our simulations when using AMR. Because AMR makes 
independent decisions for different Lev?, a particular sub- 
domain may sometimes have the same number of grid 
points for two different values of Lev? at a given time, and 
the subdomain boundaries do not necessarily coincide for 
different Lev?. Thus, plots like Figure 2 sometimes show 
anomalously small differences between particular pairs 
of numerical resolutions (for instance Lev2 vs. Lev3 near 
t = —3500 M in the top panel of Figure 2). See Sec. IIIB 
of [35]. Nevertheless, the waveform differences generally 
decrease quickly with increasing resolution. Let 


Sh e ’ m (q) 


Ee, m \\h e r(-,qW 


(1) 


be the disagreement between two waveform modes h\ m 
and h 2 ’ m where \\h e ’ m (-, q)\\ 2 = J dt \h e ’ m (t ; q) | 2 . We esti¬ 
mate the numerical truncation error of each mode when 
hi and h 2 are waveforms computed at the two highest 
resolutions. The full waveform 2 error for a given mass 
ratio is 6h(q) = m dh e ’ m (q). We report numerical 
truncation errors after an overall simulation-dependent 
time shift and rotation (which we shall refer to as surro¬ 
gate alignment , described in the next section), which are 
physically unimportant coordinate changes. The result¬ 
ing estimated numerical truncation errors of the domi¬ 
nant ( 2 , 2 ) modes, using our surrogate alignment scheme, 
are shown in Fig. 3 (black circles). 

Additional error sources are non-zero eccentricity in 
the (intended to be circular) NR initial data, and 
an imperfect procedure for integrating 'F 4’ 71 to obtain 
h e,m = A e ’ m exp(— iip e ’ m ). These both cause small oscil¬ 
lations in the waveform amplitudes A ( ’ m (t) and phases 
ip e,m (t) [47, 50] that we model following [50]. We also 
compute the error in the strain integration scheme by 
comparing to two time derivatives of as well as 
estimates for numerical errors in the CCE method [48]. 
For the (2,2) mode, these additional errors are negligibly 
small compared to SpEC truncation errors (cf. Fig. 3). 

Preparing NR waveforms for surrogate modeling- We 
apply a simulation-dependent time shift and physical ro¬ 
tation about the z-axis so that all the modes’ phases are 
aligned. This reveals the underlying parametric smooth¬ 
ness in q that will be useful for building a surrogate. Our 


1 We integrate q /4 twice in the (dimensionless) frequency domain 
by dividing -^ 4 ,m (/) by [2?r max(/, 2/ 0 /3)] 2 , where f 0 is the 
initial GW mode frequency. 

2 Throughout, we exclude m = 0 modes because (non-oscillatory) 

Christodoulou memory is not accumulated sufficiently in current 

NR simulations [49]. 
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FIG. 3. Numerical truncation errors (black) dominate all 
other sources of error for the (2,2) mode, except for simula¬ 
tion 1 (q = 1), where the truncation errors are already very 
small. For some weaker modes, systematic amplitude oscilla¬ 
tions primarily due to eccentricity may become more relevant. 
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time shifts set each waveform’s total amplitude 



to be maximum at t = 0. After enforcing this alignment 
scheme we interpolate the waveform mode amplitudes 
and phases onto an array of uniformly spaced times in 
[—2750,100]M, with At = 0AM. Finally, we align the 
initial gravitational wave mode phases by performing a 
simulation-dependent, constant (in time) physical rota¬ 
tion about the ^-axis so that ip 2 ’ 2 (ti) = v? 2 ,_ 2 (ti), which 
fixes a physical rotation up to multiples of 7r. We resolve 
the ambiguity by requiring <yj 2,1 (t i ) £ (—7r, 0]. Waveform 
truncation errors, after performing this surrogate align¬ 
ment scheme, are shown in Fig. 2. In what follows, we 
call “truncation error after surrogate alignment” simply 
“truncation error.” 

Building the surrogate- Each m > 0 mode, h e ’ m (t;q), 
is modeled separately while (due to reflection symmetry 
about the orbital plane) m < 0 modes are evaluated us¬ 
ing h e ~ m (t;q) = l-lYh e ’ m (t;q)*. We model all m ± 0 
modes but keep only those yielding smaller surrogate er¬ 
rors compared to setting the mode to zero. Table II 
lists our modeled modes and their errors. 

Our complete surrogate waveform model is defined by 
hs(t, 9 , </>■ q) = h^ m (t; q)- 2 Y em (6, </>) where 

h^ m (t-,q) = A^ m (f,q)e~ i ^ m ^\ 

Nx 

q) = E B^(t)X^(q ), X = {A, ip}. 

i—1 

Unlike Ref. [27], we construct a reduced basis representa¬ 
tion for the waveform amplitudes and phases separately, 
instead of the waveforms themselves [37]. Here, the 
l are computed off-line from the SpEC wave¬ 
forms [27] . At a set of Nx specially selected times 
which are the empirical interpolant nodes 
[27, 51], the functions Xf m {q) « A^ m (T(U; 9 ) approx¬ 
imate the parametric variation of the amplitudes and 






4 


phases (via fitting). A thorough discussion of surrogate 
model building steps is presented in [27]. When evaluat¬ 
ing the surrogate at a particular mass ratio, the fits are 
evaluated first to determine the amplitudes and phases 
at their respective interpolating times The re¬ 

maining operations yield the surrogate model prediction, 

To find each Xf m (q) we perform least-squares fits to 
the 22 data points, {X^ m (Tj/[-; qj)} 2 = x . All fits except 
odd to mode amplitudes use 5th degree polynomials in 
the symmetric mass ratio, v = q/( 1 + q) 2 . For odd to 
modes, the amplitude approaches 0 and its derivative 
with respect to v diverges as q —> 1 (or v —> 1/4). Con¬ 
sequently, we use A\ m (v) = J2n=i/2 i a £ m (l — 4i/)” to 
account for this behavior. The waveform phases of odd 
to modes at q = 1, which are undefined, are excluded 
when fitting for each p\ m {q ). 

Assessing surrogate errors- We next assess the surro¬ 
gate’s predictive quality. To quantify the error in the 
surrogate model itself, as opposed to its usage in a data 
analysis study, we do not minimize the errors over rela¬ 
tive time and phase shifts here. 

A first test is a consistency check to reproduce the 
22 input SpEC waveforms used to build the surrogate. 
These errors are shown in Fig. 4 (red squares) and are 
comparable to or smaller than the largest SpEC trunca¬ 
tion errors (black circles). 

A more stringent test is the leave-one-out cross- 
validation (LOOCV) study [52]. For each simulated mass 
ratio qi, we build a temporary trial surrogate using the 
other 21 waveforms, evaluate the trial surrogate at g.j, 
and compare the prediction with the SpEC waveform for 
qi. Hence, the trial surrogate’s error at qi should serve 
as an upper bound for the full surrogate trained on all 
22 waveforms. Repeating this process for all possible 
20 LOOCV tests 3 results in Fig. 4 (blue triangles). De¬ 
spite the ith trial surrogate having no information about 
the waveform at the errors remain comparable to the 
largest SpEC truncation errors. The LOOCV errors are 
typically twice as large as the full surrogate ones confirm¬ 
ing the former as bounds for the latter. Relative errors 
for selected modes are shown in Table II. While weaker 
modes have larger relative errors, their power contribu¬ 
tion is small enough that the error in the full surrogate 
waveform, dh, is nearly identical to the SpEC resolution 
error. 

A third test is to compare the surrogate waveforms to 
those of a second surrogate, built from the second highest 
resolution SpEC waveforms. The resulting comparison is 
shown in Fig. 4 (cyan line). These errors are compara¬ 
ble to SpEC waveform truncation errors (black circles). 


(£,m) 

Surrogate 

Max Mean 

NR 

Max Mean 

(A, to) 

Surrogate 

Max Mean 

NR 

Max Mean 

(2,2) 

0.36 

0.07 

0.36 

0.08 

(3,2) 

100 

17 

1.7 

0.43 

(2,1) 

29 

3.4 

4.1 

0.54 

(4,4) 

7.4 

2.2 

20 

2.1 

(3,3) 

53 

4.1 

11 

0.94 

All 

0.42 

0.12 

0.40 

0.10 


TABLE II. Relative mode errors, reported as 10 3 x 
||/ig’ m (g) — h e,m (q)\\ 2 /\\h i,rn (q)\\ 2 , from the leave-one-out sur¬ 
rogates. Only those modes which contribute greater than 
0.02% to the full waveform’s time-domain power are used 
in the computation of the max and mean, except for ‘All’ 
which is just Sh. Our surrogate also includes the (3,1), 
(4, [2, 3]), (5, [3,4, 5]), (6, [4, 5, 6]), (7, [5,6, 7]), and (8, [7, 8]) 
modes. Weaker modes typically have relative errors between 
1% and 35%. 



FIG. 4. Waveform differences between the two highest SpEC 
resolutions (black circles), surrogates built from the two high¬ 
est SpEC resolutions (cyan line), the full surrogate and SpEC 
(red squares), and leave-one-out trial surrogates and SpEC 
(blue triangles). The largest surrogate error is for q = 2, for 
which the (2, 2) mode is shown in Fig. 1. 


We find that the surrogate building process is robust to 
resolution differences. Furthermore, the surrogate can be 
improved using NR waveforms of higher accuracy. 

We perform a final test and construct surrogates using 
the first N selected mass ratios (from Table I) as input 
waveforms, leaving 22 — N mass ratios with which to test. 
We find the total surrogate error decreases exponentially 
with N and is comparable to the SpEC truncation error 
after using 15 waveforms. Some modes (e.g., (2,2)) are 
fully resolved after as few as 7 waveforms. 

Comparison to EOB- For data analysis purposes, we 
compare our surrogate with EOBNRv2 [19] and SEOB- 
NRv2 [21] models (generated from a current implementa¬ 
tion 4 in LAL [38]). In Fig. 5, we show the unfaithfulness 


■ -.» t r d ^f-AXW-A<P + s<P)^>‘‘ (4) 

•».« Ash, «»(/) 


We omit the smallest and largest mass ratios here as the corre¬ 
sponding trial surrogates would extrapolate to their values. 


We find that very small changes (~10 —12 ) in the minimum fre¬ 
quency or the total mass can have unexpectedly large changes in 
the unfaithfulness (~10 —4 ) 
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FIG. 5. Unfaithfulness, from Eq. (4), comparing SpEC with 
our surrogate, EOBNRv2, and SEOBNRv2 models using all 
available m^O modes. Dashed lines show the unfaithfulness 
for (2, 2) modes only. All waveforms are Planck-tapered [54] 
for t € [-2750,-2500 ]M and t € [50,90]M. For the full 
multi-modal waveforms, we maximize the unfaithfulness over 
d and ip for the worst-case scenario. We use the “+” polariza¬ 
tion, which is non-zero for all ( 9 , ip). Left: The shaded regions 
contain all 22 mass ratios, while the dashed lines maximize 
over mass ratio. The vertical grey line is the minimum total 
mass («115 Mq) ensuring all (2,2) modes start with < 15Hz 
at the end of the first tapering window. Right: Unfaithful¬ 
ness for a 115 Mq binary. 

of the surrogate and the two EOB models against the 
NR waveforms. Here, h is the normalized Fourier trans¬ 
form of h (such that a waveform’s unfaithfulness with it¬ 
self gives 0), and S n (f) the advanced LIGO zero-detuned 
high power sensitivity noise curve [53]. The surrogate is 
more faithful than both EOB models for all cases con¬ 
sidered. Since SEOBNRv2 only provides (2, ±2) modes, 
it performs worst for large total masses where additional 
modes become important. All models predict the (2, 2) 
mode with an unfaithfulness < 1% for q £ [1,10] at 
115Mq, however the EOB models are limited by the 
availability of subdominant modes. 

Discussion- We have built a surrogate model for 
NR non-spinning BBH merger waveforms generated by 
SpEC. On a standard 2015 single core computer, all 
77 modes with 2 < £ < 8 are evaluated in ss 0.5 sec 
(« 0.01 sec for a single mode) providing a factor of 
~ 10 6-8 speedup compared to SpEC. Importantly, this 
is achieved with only a small loss in accuracy. Like other 
data-driven modeling strategies, our surrogate is valid 
only within the training intervals, namely, q £ [1,10] and 
t/M £ [—2570,100]. Therefore, within the training inter¬ 
vals, our surrogate model generates BBH merger wave¬ 
forms that are equivalent to SpEC outputs up to numer¬ 
ical error and a small modeling error. 

NR surrogates can be used for multiple-query applica¬ 
tions in gravitational wave data analysis such as detector- 
specific template-bank (re-)generation and parameter es¬ 
timation. Our surrogate, and more generally the results 
of this paper, open up the exciting possibility of perform¬ 


ing, for example, parameter estimation with multi-modal 
NR waveforms (with hybridization, if needed). Param¬ 
eter estimation studies seeking to incorporate model er¬ 
ror may benefit from the surrogate’s relatively straight¬ 
forward characterization and assessment of uncertainty 
from a combination of the surrogate’s and SpEC’s sys¬ 
tematic and numerical errors. We anticipate NR surro¬ 
gate modeling to complement traditional strategies [15- 
24, 26] by providing unlimited high-fidelity approxima¬ 
tions of NR waveforms with which to calibrate, refine and 
make comparisons. Building NR surrogates of precessing 
BBH merger waveforms, which may be modeled from the 
parameters specially selected in [55] , offer a promising av¬ 
enue for modeling the full 7 dimensional BBH parameter 
space. The surrogate model described in this paper is 
available for download at [56, 57]. 
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